jcrussell

Searches for "Nullsoft" in the manifest to avoid false positives. Possibly too strict.

Fixes #1249

qkaiser self-assigned this Sep 4, 2025
qkaiser self-requested a review September 4, 2025 07:45
qkaiser added the enhancement, format:executable, and python labels Sep 4, 2025
@qkaiser
Contributor

qkaiser commented Sep 4, 2025

@jcrussell you should also create integration tests to check that the handler works as expected.

You have to create the following directories:

  • unblob/tests/integration/executable/pe/__input__
  • unblob/tests/integration/executable/pe/__output__

I would put the following in the input directory:

  • a normal PE file
  • a normal PE file with prefix and suffix padding
  • a nullsoft PE file
  • a nullsoft PE file with prefix and suffix padding
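
The padded variants listed above could be produced from the plain samples with a small helper. This is a minimal sketch, not part of the PR: the function name `pad_sample`, the padding lengths, and the zero fill byte are all assumptions chosen for illustration.

```python
def pad_sample(payload: bytes, prefix_len: int, suffix_len: int, fill: bytes = b"\x00") -> bytes:
    """Return payload surrounded by prefix_len and suffix_len bytes of fill.

    Hypothetical helper: the fill byte and the lengths are arbitrary choices,
    not values taken from the unblob test suite.
    """
    if len(fill) != 1:
        raise ValueError("fill must be exactly one byte")
    if prefix_len < 0 or suffix_len < 0:
        raise ValueError("padding lengths must be non-negative")
    return fill * prefix_len + payload + fill * suffix_len


# The original bytes start at offset prefix_len in the padded result.
padded = pad_sample(b"MZ-example", prefix_len=16, suffix_len=32)
```

The result can be written to the input directory with `pathlib.Path.write_bytes`. A handler that carves correctly should report a chunk starting at `prefix_len` and ending before the suffix.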

To generate the output directory content, run the following:

find unblob/tests/integration/executable/pe/__input__ -type f -exec unblob -f -k -e unblob/tests/integration/executable/pe/__output__ {} \;

@qkaiser
Contributor

qkaiser commented Sep 29, 2025

@jcrussell any update on this? Do you need assistance?

@jcrussell
Author

@jcrussell any update on this? Do you need assistance?

@qkaiser: I believe the code is close to final. Do you mind adding the integration test data? It is easier for me to release code than data. Here's what I have been testing with:

Thanks in advance!


return ValidChunk(
start_offset=start_offset,
end_offset=start_offset + binary.original_size,
Contributor

original_size is the file size on disk, not the actual PE file size. Samples with suffix are carved with the suffix, which is incorrect. I'm looking into it.

@qkaiser
Contributor

qkaiser commented Oct 15, 2025

@jcrussell had to figure out how to handle LFS on forks, looks like it's okay now. Made some adjustments to keep pyright happy given LIEF's ability to return completely different types for the same object.

We need to fix the way the end offset is calculated; it'll probably be based on section sizes and header size. Without that, unblob considers everything after the PE as part of the PE chunk.
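
One way to derive an end offset from header and section sizes is to take the furthest file offset reached by the headers or by any section's raw data. The sketch below works on plain numbers rather than LIEF objects, so the function name `pe_end_offset` and the `(raw_offset, raw_size)` input shape are assumptions. It also ignores anything stored outside the sections, such as a certificate table or installer payload.

```python
from typing import Iterable, Tuple


def pe_end_offset(sizeof_headers: int, sections: Iterable[Tuple[int, int]]) -> int:
    """Return the file offset just past the headers and all section raw data.

    `sections` holds hypothetical (raw_offset, raw_size) pairs, one per section.
    """
    end = sizeof_headers
    for raw_offset, raw_size in sections:
        if raw_size == 0:
            # A section without raw data occupies no space in the file.
            continue
        end = max(end, raw_offset + raw_size)
    return end


# Two sections laid out back to back after 1024 bytes of headers.
example_end = pe_end_offset(1024, [(1024, 512), (1536, 2048)])
```

Taking the maximum instead of summing the sizes keeps the result correct when sections are not contiguous or when alignment gaps exist between them.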

…table

Add support for PE files by relying on LIEF to parse the file once matched
on the 'MZ' or 'PE' signature.

If the file is a self-extracting NSIS executable
("Nullsoft.NSIS.exehead" present in the manifest), we extract it with 7zip.

Co-authored-by: Quentin Kaiser <[email protected]>
@jcrussell
Author

Thanks for moving this along!

We need to fix the way the end offset is calculated; it'll probably be based on section sizes and header size. Without that, unblob considers everything after the PE as part of the PE chunk.

I started looking into this:

>>> pe = lief.PE.parse("tests/integration/executable/pe/__input__/nsis-3.11-setup.exe")
>>> pe.original_size
1564991
>>> pe.sizeof_headers
1024
>>> sum([s.sizeof_raw_data for s in pe.sections])
52224
>>> sum([v.size for v in pe.data_directories])
19456
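
Putting the numbers above together gives a rough idea of how much of the file lies outside the headers and sections. This is only arithmetic on the values printed in the session. It assumes the sections are contiguous, and it leaves the data directory total out on the assumption that those directories describe ranges already inside the sections, which has not been verified for this sample.

```python
original_size = 1564991      # pe.original_size from the session above
sizeof_headers = 1024        # pe.sizeof_headers
sections_raw_size = 52224    # sum of sizeof_raw_data over pe.sections

# Bytes accounted for by the headers plus section raw data.
covered = sizeof_headers + sections_raw_size

# Bytes left over after the headers and sections.
trailing = original_size - covered

print(covered, trailing)  # 53248 1511743
```

Under those assumptions, roughly 1.5 MB of the installer sits after the PE image proper.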

Found this script that dumps a bunch of info, going to try a more complete look at all the parts tomorrow.

https://github.com/lief-project/LIEF/blob/main/api/python/examples/pe_reader.py

@jcrussell
Author

This works for (some) non-NSIS PEs but trims off the data that NSIS adds after the PE, which contains what we actually want to extract. The "trimmed" data is not recognized by any handler. It seems like we need to detect whether it's an NSIS installer in calculate_chunk to see if we need to increase the size for the NSIS data.

        size = binary.sizeof_headers + sum(s.sizeof_raw_data for s in binary.sections)

        return ValidChunk(
            start_offset=start_offset,
            end_offset=start_offset + size,
        )
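
The NSIS check described above could take roughly the following shape. This is a standalone sketch, not the handler's code: `chunk_end_offset`, its parameters, and the way the manifest text is obtained are hypothetical. It also assumes the caller already knows where the NSIS data ends (`data_end`). Using the end of the file for that value would again include any suffix padding.

```python
from typing import Optional

# Marker string that the commit message says appears in NSIS manifests.
NSIS_MARKER = "Nullsoft.NSIS.exehead"


def chunk_end_offset(start_offset: int, pe_size: int, data_end: int, manifest: Optional[str]) -> int:
    """Pick the chunk end: the PE image alone, or the PE plus trailing NSIS data.

    Hypothetical helper. `pe_size` is the headers-plus-sections size and
    `data_end` is the absolute offset where the NSIS data is believed to end.
    """
    pe_end = start_offset + pe_size
    if manifest is not None and NSIS_MARKER in manifest:
        # Never shrink the chunk below the PE image itself.
        return max(pe_end, data_end)
    return pe_end


plain_end = chunk_end_offset(16, 53248, 1565007, manifest=None)
nsis_end = chunk_end_offset(16, 53248, 1565007, manifest='<assemblyIdentity name="Nullsoft.NSIS.exehead"/>')
```

A stricter version would derive `data_end` from the installer's own metadata instead of from the file size, so that suffix padding is excluded for NSIS samples as well.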



Development

Successfully merging this pull request may close these issues.

Support NSIS Installers

2 participants